Overview of the TREC 2007 Enterprise Track

Authors

  • Peter Bailey
  • Arjen P. de Vries
  • Nick Craswell
  • Ian Soboroff
Abstract

The collection consists of all the *.csiro.au (public) websites as they appeared in March 2007. The resulting data set contains 370 715 documents, with a total size of 4.2 gigabytes. The web crawler visited the outward-facing pages of CSIRO in a fashion similar to the crawl used in CSIRO’s own search engine; in fact, the documents were gathered with the same crawler technology that CSIRO uses (http://www.funnelback.com/). The corpus contains approximately 7.9 million hyperlinks, and 95% of pages have one or more outgoing links containing anchor text. One participant extracted email addresses of 3678 individuals, with 38% of documents containing at least one mailto field.

Similar resources

THUIR at TREC 2007: Enterprise Track

We participated in the document search and expert search tasks of the Enterprise Track at TREC 2007. The motivation behind the TREC Enterprise Track is to study the issues of searching for documents and experts inside an enterprise environment, which have not been sufficiently addressed in research. In document search, we focus on key overview page pre-selection methods and link analysis algorithms. In expert sear...

Research on Enterprise Track of TREC 2007

We (ICT-CAS team) participated in the Enterprise Track of TREC 2007. This paper reports our experimental results on this track.

UMass at TREC 2006: Enterprise Track

This paper gives an overview of the work done at the University of Massachusetts, Amherst for the TREC 2006 Enterprise track. For the discussion search task, we compare two methods for incorporating thread evidence into the language models of email messages. For the expert finding task, we create implicit expert representations as mixtures of language models from associated documents.

University of Twente at the TREC 2008 Enterprise Track: Using the Global Web as an Expertise Evidence Source

This paper describes the details of our participation in the expert search task of the TREC 2007 Enterprise track.

Overview of the TREC 2005 Enterprise Track

The goal of the enterprise track is to conduct experiments with enterprise data — intranet pages, email archives, document repositories — that reflect the experiences of users in real organisations, such that for example, an email ranking technique that is effective here would be a good choice for deployment in a real multi-user email search application. This involves both understanding user ne...

UALR at TREC-ENT 2007

This is the first year we participated in the enterprise track. This year’s enterprise track offered completely new enterprise data and two new tasks. The data offered was the CSIRO Enterprise Research Collection corpus. The two new tasks introduced this year are expert search and document search. We participated in both tasks, though document search was our primary focus this year. We also ...

Publication date: 2007